Automated Planning in Repeated Adversarial Games
نویسندگان
چکیده
Game theory’s prescriptive power typically relies on full rationality and/or self–play interactions. In contrast, this work sets aside these fundamental premises and focuses instead on heterogeneous autonomous interactions between two or more agents. Specifically, we introduce a new and concise representation for repeated adversarial (constant–sum) games that highlight the necessary features that enable an automated planing agent to reason about how to score above the game’s Nash equilibrium, when facing heterogeneous adversaries. To this end, we present TeamUP, a model–based RL algorithm designed for learning and planning such an abstraction. In essence, it is somewhat similar to R-max with a cleverly engineered reward shaping that treats exploration as an adversarial optimization problem. In practice, it attempts to find an ally with which to tacitly collude (in more than two–player games) and then collaborates on a joint plan of actions that can consistently score a high utility in adversarial repeated games. We use the inaugural Lemonade Stand Game Tournament to demonstrate the effectiveness of our approach, and find that TeamUP is the best performing agent, demoting the Tournament’s actual winning strategy into second place. In our experimental analysis, we show hat our strategy successfully and consistently builds collaborations with many different heterogeneous (and sometimes very sophisticated) adversaries.
منابع مشابه
Robust Planning in Domains with Stochastic Outcomes, Adversaries, and Partial Observability
Real-world planning problems often feature multiple sources of uncertainty, including randomness in outcomes, the presence of adversarial agents, and lack of complete knowledge of the world state. This thesis describes algorithms for four related formal models that can address multiple types of uncertainty: Markov decision processes, MDPs with adversarial costs, extensiveform games, and a new c...
متن کاملCounterexample-guided Planning
Planning in adversarial and uncertain environments can be modeled as the problem of devising strategies in stochastic perfect information games. These games are generalizations of Markov decision processes (MDPs): there are two (adversarial) players, and a source of randomness. The main practical obstacle to computing winning strategies in such games is the size of the state space. In practice ...
متن کاملModified Adversarial Hierarchical Task Network Planning in Real-Time Strategy Games
The application of artificial intelligence (AI) to real-time strategy (RTS) games includes considerable challenges due to the very large state spaces and branching factors, limited decision times, and dynamic adversarial environments involved. To address these challenges, hierarchical task network (HTN) planning has been extended to develop a method denoted as adversarial HTN (AHTN), and this m...
متن کاملOBDD-Based Optimistic and Strong Cyclic Adversarial Planning
Recently, universal planning has become feasible through the use of efficient symbolic methods for plan generation and representation based on reduced ordered binary decision diagrams (OBDDs). In this paper, we address adversarial universal planning for multi-agent domains in which a set of uncontrollable agents may be adversarial to us. We present two new OBDD-based universal planning algorith...
متن کاملAdversarial Hierarchical-Task Network Planning for Real-Time Adversarial Games
Real-time strategy (RTS) games are hard from an AI point of view because they have enormous state spaces, combinatorial branching factors, allow simultaneous and durative actions, and players have very little time to choose actions. For these reasons, standard game tree search methods such as alphabeta search or Monte Carlo Tree Search (MCTS) are not sufficient by themselves to handle these gam...
متن کامل